-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MV-473: [MV-MIOpen] Benchmark & Port Interpolate #41
MV-473: [MV-MIOpen] Benchmark & Port Interpolate #41
Conversation
Wow the improvement is very big. Could you provide script you used to benchmark rocm vs modnn + script benchmark rocm vs miopen to here or comment on your Jira ticket (recommend this method because recently our workflow required uploading benchmark script to comment of Jira ticket). Did you find any specific reasons that the improvement in some cases is very big? For example I saw linear backward is around 200x improvement compared to ROCm |
Very nice of you to remind me. I added the script to JIRA ticket. There are no special reasons; my MIOpen's performance is similar to Modnn's performance, and I made some constraints to eliminate low-performance cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review
Wow the numbers look very impressive. But Some of that don't make sense to me, for example 99.29x in fp16 and 2.59x in fp32 (Bilinear bwd). Can you figure out why this much difference happen? It looks like one case with 1800x improvement ruined the geomean but please check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some reviews I got from AMD and I think it will fit to your PR too
Run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review
600728c
to
4d008c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review for solvers + kernels (without backward bicubic because this kernel is more complicated than others, I will review it later). Overall it is all good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gtest + other things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some minor things in driver
Overall it looks good now. Make sure to check this comment #41 (comment) and update 5D backward nearest benchmark result after you create more test cases. Also I found that in the newest commit d15fbc5 you commented out |
I commented for re-run benchmark, updated 5D backward, check it out |
LGTM! Thank you for your hard work in this PR, and make sure to ask for review from other reviewers. |
@kyeonghwanryu @DuongQLee Do you want to add more reviews? If not, can I close this PR? |
Sorry it seems I missed the comment. I'm closing it. |
Added Interpolate forward and backward with multi-mode operation and kernel.
Added driver test and gtest for Interpolate .
New API is guarded by MIOPEN_BETA_API macro.
Compared to ROCm pytorch: Link
Average over all cases:
Mode Nearest